ABSTRACT

During the Spring semester of 1981, the Mathematics Research Center held
a weekly statistical discussion series as a precursor to its special year on
Scientific Inference, Data Analysis, and Robustness. The many discussants
included G. E. P. Box, D. V. Lindley, B. W. Silverman, A. Herzberg, C. F. Wu,
B. Joiner and D. Rubin. Many aspects of statistics were discussed, including
the Box philosophy of deductive and inductive reasoning, and Lindley's
coherent Bayesian viewpoint. The present paper attempts to constructively
review the discussion series, and to add a number of retrospective comments
and suggestions.


AMS  (MOS)  Subject  Classifications:  62-06,  62A15 

Key Words: Modelling, Inductive, Deductive, Bayes, Frequentist,
Significance Testing

Work  Unit  Number  4  (Statistics  and  Probability) 




SIGNIFICANCE  AND  EXPLANATION 


There  are  two  main  types  of  statistical  reasoning.  Deductive  reasoning 
is  concerned  with  inferences  conditional  upon  the  truth  of  the  model,  whilst 
induction  relates  to  model  formulation  and  scientific  discovery.  During  the 
MRC  Statistical  Discussion  Series,  in  the  Spring  of  1981,  a  variety  of  aspects 
of  this  and  related  philosophies  were  discussed.  Topics  covered  include 
Checking  Models,  the  Likelihood  Principle,  Principles  on  Model  Space, 
Significance  Testing,  Scientific  Discovery,  Data  Analysis,  Randomization, 
Robust and D-Optimal Designs, Data-Handling, Subjective Probability for
No-Data Problems, How Statistics Should Be Taught e.g. on Short Courses,
Sequential Analysis, Assessing Prior Predictive Distributions, Rounding Errors
in  Regression,  Exchangeability  in  Statistics,  the  Future  of  Statistics,  and 
Statistical  Ethics.  In  the  present  paper  these  discussions  are  critically 
reviewed,  and  some  further  suggestions  made. 


The  responsibility  for  the  wording  and  views  expressed  in  this  descriptive 
summary  lies  with  MRC,  and  not  with  the  author  of  this  report. 


SOME PHILOSOPHIES OF INFERENCE AND MODELLING


Tom  Leonard 


SECTION 1: SESSIONS 1 TO 3 WITH FURTHER IDEAS ON MODELLING,
SIGNIFICANCE TESTING AND SCIENTIFIC DISCOVERY.

Session 1: Checking Models, George Box (1/23/81)

In  the  first  session  the  deductive  and  inductive 
aspects  of  statistical  investigation  were  discussed. 
Deduction is appropriate for inferences conditional upon the truth
of the model, whilst inductive thought is necessary during
model checking. During the semester it became apparent that
all serious discussants were in agreement on this issue.

There was a bit less agreement on which philosophy
should be employed during the model checking procedure.
Discussants seemed to split into the following three main
areas:

(a) Bayes is good for inferences given the model but
frequentist procedures, e.g. significance tests, are
necessary when checking the model.

(b) Bayes is good for inferences given the model, and
Bayes is also good for model-checking (e.g. prior
distributions on either sampling densities or different
models or polynomial coefficients) but more Bayesian theory
needs to be developed in the model-checking area.

(c)  Frequentist  procedures  are  adequate  for  both 
inferences  and  model-checking. 

The  main  debates  were  between  (a)  and  (b).  Frequentist 
model-checking  needs  few  assumptions  about  alternative 
models, whilst Bayesian assumptions always reduce to a grand
model involving models across models. Therefore frequentist
model-checkers can point to the simplicity and generality of
their approach, whilst Bayesians could give the response
that it is always necessary to inject a certain amount of
structure into the analysis in order to focus upon precise
conclusions. This is an important issue which was largely
unresolved. 

The Inductive Modelling Process (IMP) and the
subsidiary  importance  of  coherence  were  discussed,  with 
responses  by  D.  V.  Lindley,  A.  F.  M.  Smith,  and  others,  in 
my  paper  in  the  recent  volume  on  "Bayesian  Statistics" 
issued  by  the  University  of  Valencia  Press. 

Session 2: The Likelihood Principle, Tom Leonard (1/30/81)

In the second session the Likelihood Principle was
introduced in the context of making inferences conditional
upon the truth of the model, and the proof of Birnbaum's
theorem was presented. This proves that if the statistician
accepts the sufficiency and conditionality principles (which
are open to straightforward frequentist interpretations)
then he must accept the Likelihood Principle, conditional
upon the truth of the model, and should not therefore employ
any approach involving integrations across the sample space
(e.g. UMVU estimation, confidence intervals, significance
tests).
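
The conflict between the Likelihood Principle and procedures that
integrate across the sample space can be seen in a standard textbook
illustration, sketched below. It is added for illustration and was
not part of the session: the data (9 heads, 3 tails), the null value
0.5 and all function names are assumptions. Two stopping rules give
proportional likelihoods, yet the one-sided tail areas differ.

```python
from math import comb

HEADS, TAILS = 9, 3   # hypothetical data: 9 heads and 3 tails
THETA0 = 0.5          # null value of P(head), chosen for illustration


def binomial_p_value(heads, tails, theta0):
    """P(at least `heads` heads) when n = heads + tails is fixed in advance."""
    n = heads + tails
    return sum(comb(n, k) * theta0**k * (1 - theta0)**(n - k)
               for k in range(heads, n + 1))


def negative_binomial_p_value(heads, tails, theta0):
    """P(at least `heads` heads before the `tails`-th tail).

    Equivalent to seeing at most tails - 1 tails in the first
    heads + tails - 1 tosses.
    """
    m = heads + tails - 1
    return sum(comb(m, j) * (1 - theta0)**j * theta0**(m - j)
               for j in range(tails))


def likelihood_ratio(theta_a, theta_b):
    """L(theta_a) / L(theta_b); the same under both stopping rules,
    because the combinatorial constants cancel."""
    def kernel(t):
        return t**HEADS * (1 - t)**TAILS
    return kernel(theta_a) / kernel(theta_b)


p_fixed_n = binomial_p_value(HEADS, TAILS, THETA0)
p_stop_at_tail = negative_binomial_p_value(HEADS, TAILS, THETA0)
print(f"fixed n = 12:          p = {p_fixed_n:.4f}")
print(f"stop at the 3rd tail:  p = {p_stop_at_tail:.4f}")
print(f"likelihood ratio 0.75 vs 0.5: {likelihood_ratio(0.75, 0.5):.3f}")
```

Under these assumed data the fixed-sample tail area is 299/4096
(about 0.073) and the stop-at-third-tail tail area is 67/2048 (about
0.033), so a 5% test reaches opposite verdicts although the
likelihood function is the same in both cases.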

The  reaction  to  these  ideas  was  interesting.  Bayesians 
viewed  the  sufficiency  and  conditionality  principles  as 
obviously  acceptable.  Traditional  significance  testers  felt 
that, since the Likelihood Principle and testing are not
compatible, there must be something misleading in the
underlying assumptions (most likely the Conditionality
Principle). Another expressed view was that the Likelihood
Principle is largely irrelevant since it conditions on the
truth of the model, whilst most of the statistician's effort
needs to be spent on model-building. One nice
interpretation was that "if your analysis does not satisfy
the Likelihood Principle then this means that your model is
wrong".

Overall, it seemed that few existing views were changed
by this exposure to Birnbaum's theorem. This may be viewed
as surprising, as the proof and underlying assumptions for
this theorem are extremely plausible and simple.

Session  3:  Tying  together  the  ideas  of  the  last  two 
sessions,  Dennis  Lindley, 

In  the  third  session  an  attempt  was  made  to  extend  the 
Likelihood  Principle  ideas  from  the  inferential  to  the 
modelling  situation.  This  involved  a  prior  distribution 
across  model  space,  and  the  ideas  therefore  needed  to  be 
partly  interpreted  from  a  Bayesian  point  of  view.  The  main 
reactions  were  either  (a)  a  Likelihood  Principle  would  be 
neither  reasonable  nor  desirable  in  modelling  situations 
since  frequentist  ideas  are  obviously  more  appropriate  on 
model  space,  or  (b)  these  ideas  would  be  desirable  in 
modelling  situations  but  some  further  theoretical 
development  would  be  needed  in  order  to  obtain  a  modelling 
principle  with  the  same  impact  for  non-Bayesians  as  the 
Likelihood  Principle  for  inference. 

Perhaps the discussants in Session 3 might have
favourably considered the following principle:

The Modelling Principle (special case of Sufficiency
Principle and of the Likelihood Principle)

Suppose that the outcome of an experiment is the numerical
realization $x$ taking values in a sample space $\Omega$. Let
$f_1(\cdot)$ and $f_2(\cdot)$ be two probability densities defined
on $\Omega$ (with respect to an appropriate dominating measure)
such that

$$f_1(x) = f_2(x).$$

Then, unless there is information external to the data to suggest
otherwise, neither of $f_1$ and $f_2$ should be viewed as preferable
for modelling conclusions based on $x$.

N.B. For each $i$, $f_i(x)$ is a probability density, conditional
on the unknown "parameter" $f_i$. This may also be interpreted as
the likelihood functional of $f_i$ conditional on $x$. The Modelling
Principle is saying that if $f_1$ and $f_2$ possess the same
likelihood functional then they should be viewed as equally
preferable for modelling conclusions based on $x$, in the absence of
external information.

The Modelling Principle could be used to critically interpret
well-known modelling approaches due to Tukey and Parzen. Whilst
Box's modelling approach is particularly well-formulated, the
following example is interesting. Note that problems with events of
probability zero, whilst unimportant, could be removed by extending
the Modelling Principle to say that, in the absence of external
information, $f_1$ should be preferred to $f_2$ whenever
$f_1(x) > f_2(x)$.

Example - Box's Modelling Approach

Consider an observation vector $x = (x_1, \ldots, x_n)^T$ assuming
values in the sample space $\Omega$, which we take to be
$n$-dimensional Euclidean space $R^n$. Suppose that
$g(\cdot): R \to R$ is some monotonic transformation on the real
line (e.g. a Box-Cox transformation). Assume further that the
observed elements of $x$ happen to satisfy the specific condition

$$\sum_i x_i^2 = \sum_i g^2(x_i) - 2 \sum_i \log \left| \frac{\partial g(x_i)}{\partial x_i} \right| = S^2. \qquad (*)$$

Consider the alternative models $M_1$ and $M_2$ specified by

$M_1$: The $x_i$ are realizations of independent random variables
which possess standard normal distributions. The corresponding
probability density is

$$f_1(x) = \frac{1}{(2\pi)^{n/2}} \exp\left\{ -\frac{1}{2} \sum_i x_i^2 \right\} \quad \text{for } x \in \Omega.$$

$M_2$: The $x_i$ are realizations of independent random variables
$X_i$ such that the transformed variables $Y_i = g(X_i)$ possess
standard normal distributions. The probability density is now given
by

$$f_2(x) = \frac{1}{(2\pi)^{n/2}} \exp\left\{ -\frac{1}{2} \sum_i g^2(x_i) + \sum_i \log \left| \frac{\partial g(x_i)}{\partial x_i} \right| \right\} \quad \text{for } x \in \Omega.$$

Note that under condition (*) $f_1(x) = f_2(x)$, so that whenever
$x$ satisfies (*), the Modelling Principle tells us to prefer $f_1$
and $f_2$ equally in the absence of external information.

The Box modelling approach tells us to discredit $M_1$ if the tail
probability

$$\alpha_1 = P\{ f_1(X) < f_1(x) \} = P\left\{ \sum_i X_i^2 > S^2 \right\}$$

is too small, where the probability on the right hand side arises
from the distribution of $X$ under $M_1$, and $S^2$ is specified in
(*). Note that $\alpha_1$ is just the probability that a chi-squared
random variable, with $n$ degrees of freedom, is greater than or
equal to $S^2$.

We should also discredit $M_2$ if

$$\alpha_2 = P\{ f_2(X) < f_2(x) \} = P\left\{ \sum_i Y_i^2 - 2 \sum_i \log \left| \frac{\partial g(X_i)}{\partial X_i} \right| > S^2 \right\}$$

is too small, where $Y_i = g(X_i)$ and the probability on the right
hand side is now based upon the distribution of $X$ under $M_2$.
Since $\sum_i Y_i^2$ possesses a chi-squared distribution with $n$
degrees of freedom, and this is adjusted by an extra function of
$X$, we see that $\alpha_2$ will not in general be the same as
$\alpha_1$. Therefore, although the Modelling Principle tells us to
equally prefer $M_1$ and $M_2$, we seem to arrive at different tail
probabilities in each case.

Our overall conclusion is that Box's Modelling Approach and the
Modelling Principle are not in mathematical agreement. This may be
the source of some discussion.

Perhaps the philosophy "all principles are there to be
broken, but in this case we may learn a great deal by
considering why we have broken them" may be useful here.
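
The disagreement in the example can be checked numerically in a
special case. The sketch below is an illustration under stated
assumptions and is not taken from the discussion series: it assumes
the linear transformation g(x) = c x with c = 2 and n = 2
observations, so that both tail probabilities are chi-squared tail
areas with a closed form. For this g, the two densities coincide
when the observed sum of squares equals 2 n log(c) / (c^2 - 1).

```python
import math


def chi2_sf_2df(t):
    """Upper tail area of a chi-squared variable with 2 degrees of freedom."""
    return math.exp(-t / 2.0)


N = 2      # number of observations (assumed for the closed form above)
C = 2.0    # g(x) = C * x, an assumed monotonic transformation

# Observed sum of squares at which the two densities are equal.
s2 = 2 * N * math.log(C) / (C**2 - 1)

# Log densities of the two models at any x with sum of squares s2.
log_f1 = -0.5 * N * math.log(2 * math.pi) - 0.5 * s2
log_f2 = (-0.5 * N * math.log(2 * math.pi)
          - 0.5 * C**2 * s2 + N * math.log(C))

# First model: sum X_i^2 is chi-squared(N), compared with s2.
alpha_1 = chi2_sf_2df(s2)
# Second model: sum (C X_i)^2 is chi-squared(N); the log-Jacobian term
# is constant, so the event reduces to chi-squared(N) > C^2 * s2.
alpha_2 = chi2_sf_2df(C**2 * s2)

print(f"log f1 = {log_f1:.6f}, log f2 = {log_f2:.6f}")
print(f"alpha_1 = {alpha_1:.4f}, alpha_2 = {alpha_2:.4f}")
```

With these assumed values the two densities are equal at the data,
while the tail areas are about 0.63 for the standard normal model and
about 0.16 for the transformed model.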
Significance  Testing 

During  the  first  three  sessions  a  large  amount  of  time 
was  spent  discussing  the  merits  of  significance  testing. 
There  was  some  measure  of  agreement  that  these  are 
unreasonable  given  the  model  but  much  less  agreement  in  the 
modelling  situation.  The  main  points  raised  by  objectors  to 
significance  tests  were 

(a) Fixed size tests are fairly arbitrary and it seems
to be extremely difficult to interpret the magnitude of the
p-value when so many different aspects like sample size,
model complexity, and selective reporting affect the
p-value.

(b) There is no justification for making accept/reject
decisions based on significance tests.

(c)  It  is  dangerous  to  summarize  the  results  of  an 
experiment  by  a  single  p-value. 

(d)  In  modelling  situations  it  is  necessary  to  have 
alternatives  in  mind;  standard  tests  for  fit  do  not  involve 
alternative  models  and  may  therefore  not  be  based  upon 
enough assumptions to facilitate useful conclusions.
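
Objection (a) can be made concrete with a small calculation. The
sketch below is illustrative only: the observed mean of 0.1 standard
deviations, the sample sizes and the function name are assumptions.
It shows the two-sided p-value of a z-test for the same observed
effect at three sample sizes.

```python
import math


def two_sided_p_value(mean_diff, sigma, n):
    """Two-sided p-value of a z-test of a zero mean with known sigma."""
    z = abs(mean_diff) * math.sqrt(n) / sigma
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal.
    return math.erfc(z / math.sqrt(2.0))


EFFECT = 0.1   # hypothetical observed mean, in units of sigma
p_values = {n: two_sided_p_value(EFFECT, 1.0, n) for n in (100, 400, 1600)}
for n, p in p_values.items():
    print(f"n = {n:5d}   p = {p:.5f}")
```

The same observed effect gives p of about 0.32 at n = 100, about
0.046 at n = 400, and well below 0.001 at n = 1600, so the magnitude
of a p-value cannot be read without reference to the sample size.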

Proponents of significance tests made the following
points:

(a) The p-value can be interpreted very naturally
either by thinking in terms of the tail area of the sampling
distribution or by comparison with the p-values of other
experiments. Interpretations based upon surprise factors
are particularly important.

(b) When the majority of effort is spent on model-building
it then seems rather unimportant to argue about the
difference between 5% and 4% at the end of the analysis.

(c) The p-value is only one of a large number of
aspects which a statistician should think about in reaching
his conclusion. It is not a formal mechanism e.g. for
decision making, but simply a valuable guide to the
inductive thought processes.

(d) When checking a model it is impossible to have all
possible alternatives in mind, and therefore any procedure
which conditions upon alternative models must be inadequate,
thus, for example, ruling out any Bayesian procedure.

In  summary,  whilst  tests  for  fit  might  be  viewed  as 
more  appropriate  than  tests  for  parameters  within  a  model, 
the big question is whether or not they indeed produce the
goods  i.e.  do  they  provide  a  completely  acceptable  procedure 
for  model-checking  in  the  absence  of  alternative  hypotheses, 
or  is  more  structure  needed  in  order  to  arrive  at  really 
convincing  conclusions?  In  other  words,  can  p-values  for 
tests  for  fit  be  interpreted  in  a  meaningful  way,  or  is  it 
simply  too  ambitious  to  hope  to  check  a  model 
unconditionally  upon  possible  alternatives?  My  personal 
opinion  would  be  a  bit  on  the  negative  side  but  I  would  be 
prepared  to  be  convinced  either  way.  I  challenge 
significance  testers  present  to  advise  me  how  they  actually 
make a practical judgement about a p-value; I remain
unconvinced  that  they  do  much  more  than  think  in  terms  of  1% 
and  5%. 

DISCOVERY  AND  INSIGHT  AS  OBJECTIVES  OF  THE  SCIENTIFIC  METHOD 

A  primary  purpose  of  statistics  is  to  discover  new 
real-life  conclusions  e.g.  a  possible  association  between 
important  medical  factors,  new  chemical  components  useful 
in,  say,  agriculture,  or  novel  ways  of  stimulating  the 
economy.  Statistics  also  plays  a  partly  confirmatory  role, 
but  this  is  secondary  to  discovery.  I  view  insight  as 
closely  related  to  discovery,  and  insight  and  discovery  are 
perhaps  of  equal  importance.  Inductive  modelling  combined 
with  local  deduction  takes  statistics  out  of  the 
unreasonable  restrictiveness  of  the  Neyman-Pearson  and 
coherent  Bayesian  areas,  and  into  the  forefront  of  science, 
as  an  important  vehicle  for  insight  and  discovery. 

Professor Box prefers a Bayes/frequentist compromise as
a means of describing his deductive and inductive reasoning.
I prefer a pragmatic Bayes/pragmatic Bayes compromise. I
would for example always try to work with at least the
conceptual background of a prior distribution across the
space of sampling models, and perhaps to employ a pragmatic
short-cut to approximate to a full blown non-parametric
Bayesian procedure. For example, Schwarz's criterion
provides an excellent pragmatic method for judging the
degree of a polynomial approximation to a non-parametrised
regression function or sampling density. In short, I have
developed my own pragmatic Bayes/non-parametric Bayes
procedures for coping with modelling situations, and these
will be reported in detail elsewhere. It is for example
possible for the statistician to introduce a hypothesized
model as prior estimate, and then to let the data help him
to find possible deviations from his hypothesized model.
This ties in well with the deductive/inductive scheme. (See
my course notes on Bayesian Inference and Modelling.)
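
As an illustration of how Schwarz's criterion might be used to judge
the degree of a polynomial, the following minimal sketch applies the
usual Gaussian regression form, n log(RSS/n) + k log n, to simulated
data. It is not the author's own procedure: the simulated quadratic,
the noise level, the seed and every function name are assumptions
made for the example.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss_for_degree(xs, ys, degree):
    """Residual sum of squares of the least-squares polynomial fit."""
    k = degree + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((y - sum(b * x ** i for i, b in enumerate(beta))) ** 2
               for x, y in zip(xs, ys))


def schwarz_criterion(xs, ys, degree):
    """n log(RSS / n) + (number of coefficients) log n; smaller is better."""
    n = len(xs)
    rss = rss_for_degree(xs, ys, degree)
    return n * math.log(rss / n) + (degree + 1) * math.log(n)


rng = random.Random(0)                       # fixed seed, for repeatability
xs = [-1 + 2 * i / 39 for i in range(40)]    # 40 equally spaced points
ys = [1 + 2 * x - 3 * x * x + rng.gauss(0, 0.1) for x in xs]

scores = {d: schwarz_criterion(xs, ys, d) for d in range(5)}
best_degree = min(scores, key=scores.get)
for d, s in scores.items():
    print(f"degree {d}: criterion = {s:.1f}")
print("selected degree:", best_degree)
```

On data simulated this way, degrees 0 and 1 score far worse than
degree 2, and the log n penalty discourages the higher degrees unless
they reduce the residual sum of squares appreciably.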

Unlike  Professor  Box,  I  do  not  view  the  prior 
distribution  of  the  parameters  as  part  of  the  sampling 
model.  Under  a  Bayesian  non-parametric  procedure  there  is 
no  restriction  to  the  type  of  sampling  model  which  can  be 
considered  or  to  the  type  of  discovery  which  can  be  made.  I 
however  find  Professor  Box's  frequentist  compromise  to  be  of 
potential  importance  both  in  stimulating  tremendous  input 
into the modelling area, and in suggesting that we should
check the reasonability of the prior (e.g. in its tails) as
well as the reasonability of the sampling model. These are
of  course  two  separate  problems.  Perhaps  the  prior  should 
be  checked  by  investigating  the  properties  of  the  estimates 
it  leads  to. 


SECTION 2: REVIEW OF SESSIONS 4-13

Session 4, Some Thoughts on Data Analysis,
Bernard Silverman (4/13/81)

In  the  fourth  session  the  presentation  of  statistical 
data  was  discussed,  and  a  method  based  upon  kernel 
estimators  was  proposed  for  representing  a  random  sample  by 
a  smooth  curve.  Whilst  this  provides  concise 
representations, some of the information in the sample will
be lost. There was a debate about the merits of kernels and
histograms, with histograms gaining a slight advantage.
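
The two representations under debate can be compared on a small
sample. The sketch below is illustrative and does not reproduce the
method presented in the session: the ten data values, the Gaussian
kernel, the bandwidth of 0.5 and the bin width of 1 are all
assumptions.

```python
import math


def kernel_estimate(sample, x, bandwidth):
    """Gaussian kernel density estimate at the point x."""
    norm = len(sample) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2)
               for xi in sample) / norm


def histogram_density(sample, lo, width, n_bins):
    """Histogram heights, scaled so that the bars have total area one."""
    counts = [0] * n_bins
    for xi in sample:
        # Values beyond the last bin edge are placed in the last bin.
        counts[min(int((xi - lo) // width), n_bins - 1)] += 1
    return [c / (len(sample) * width) for c in counts]


# Hypothetical sample, for illustration only.
sample = [0.8, 1.1, 1.3, 1.9, 2.0, 2.2, 2.4, 3.1, 3.5, 4.6]

heights = histogram_density(sample, lo=0.0, width=1.0, n_bins=5)
curve = [(k / 10, kernel_estimate(sample, k / 10, 0.5))
         for k in range(-30, 81)]

print("histogram heights:", heights)
print("kernel estimate at x = 2:", round(kernel_estimate(sample, 2.0, 0.5), 4))
```

Both summaries lose information: the histogram forgets where each
value lies within its bin, and the kernel curve depends on the chosen
bandwidth.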

During this session there was also a debate about
whether anyone had ever actually analysed a random sample.
The consensus of opinion seemed to be that whilst some
random samples have at times occurred in designed
experiments, most samples have non-random characteristics.

Session 5, Randomisation, Jeff Wu (2/20/81)

In  the  fifth  session  the  merits  of  randomisation  were 
debated. It seemed to be the general opinion of both
Bayesians  and  frequentists  that  randomisation  is  an 
invaluable  device.  It  for  example  removes  bias  due  to 
factors  which  would  be  difficult  to  model  precisely,  and 
also  helps  the  statistician  to  cope  with  the  problem  of  the 
lurking  variable. 

The  only  point  of  debate  was  whether  the  analysis 
should  be  carried  out  conditionally  or  unconditionally  upon 
the actual design employed. This issue parallels the debate
on  the  Likelihood  and  Conditionality  Principles. 

The problem of how to hunt out lurking variables, or
how to analyse data in the presence of lurking variables, is
one of the most important real issues with which statisticians
are faced, particularly when analysing, say, medical or
economic data, rather than data from designed experiments.
It should probably receive much more attention than, say,
the frequentist/Bayes philosophy.

Session  6,  Robust  Designs,  Agnes  Herzberg  (2/27/81) 

In the sixth session robust designs were discussed with
emphasis on the criterion of D-optimality. It was generally
agreed that

(a) The theory of experimental design should always be
mixed with practical common sense, and that a pragmatic
design is often more useful than a theoretically optimal
design, particularly when model inadequacies are taken into
account.

(b) The criterion of D-optimality is just one way
of summarizing the elements of the $X^T X$ matrix based upon
the $X$ matrix for the assumed true model, so that D-optimal
designs should be treated with a great deal of caution.
Recent work by Toby Mitchell and C. F. Wu on robustification
of designs may also be useful here.
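
The dependence of D-optimality on the assumed model can be
illustrated with a small exact calculation. The sketch below is an
added illustration, not material from the session: the two
four-point designs on [-1, 1] and the function names are assumptions.
It evaluates the determinant of $X^T X$ for a straight-line model
and for a quadratic model.

```python
from fractions import Fraction


def information_matrix(points, degree):
    """X^T X for a polynomial regression model of the given degree."""
    k = degree + 1
    return [[sum(Fraction(x) ** (i + j) for x in points) for j in range(k)]
            for i in range(k)]


def determinant(m):
    """Determinant by cofactor expansion along the first row (small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = Fraction(0)
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * determinant(minor)
    return total


# Two hypothetical designs with four runs each.
designs = {
    "end_points": [-1, -1, 1, 1],
    "spread": [-1, Fraction(-1, 3), Fraction(1, 3), 1],
}

det_line = {name: determinant(information_matrix(pts, 1))
            for name, pts in designs.items()}
det_quad = {name: determinant(information_matrix(pts, 2))
            for name, pts in designs.items()}

for name in designs:
    print(f"{name:10s}  line: {det_line[name]}   quadratic: {det_quad[name]}")
```

For the straight-line model the end-point design has the larger
determinant (16 against 80/9), but if the true model is quadratic its
determinant is zero and the curvature cannot be estimated at all,
whereas the spread design retains a determinant of 5120/729.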

Session  7,  The  Frontiers  of  Statistical  Analysis, 

Brian  Joiner  (3/6/81) 

In  the  seventh  session  the  main  point  discussed  was 
whether it is useful to discuss slight differences between
statistical methodologies when the most serious problem with
large data sets is whether they have been collected
properly, or stored properly on the computer, or whether it
is possible to obtain convenient summaries of the data set
for a preliminary analysis. A number of data sets were
presented in order to illustrate various pitfalls that may
be caused by careless data-handling.

There seem to be two separate problems here. Clearly
data handling merits considerable attention, particularly
when 90% of any statistical analysis should involve careful
consideration of the data, for example using scatter plots
and cross-tabulations. However, having done this we still
need a decent formal analysis in order to sort out the
statistical variation in the data. So good data handling
and good statistical methodology are both of essential
importance.

There seems to be some doubt as to the wisdom of
collecting large quantities of badly handled data, when only
a small proportion of it may ever get analysed. Perhaps the
philosophy "The greater the amount of information the less
you know" is not completely out of place here.

Session 8, Subjective Probability for No-Data Problems,
Jim Dickey (3/13/81)

In the eighth session we switched to subjective
probability for no-data problems, and discussed the
elicitation of prior distributions from non-statistical
experts. This is a growing area amongst a certain breed of
Bayesians, and there has been some progress for single
parameter problems. However, severe difficulties are faced
in multi-parameter situations because of the problem of
quantifying the possibly nonlinear interdependencies between
different parameters. So the problem shows some capability
of solution, but needs considerably more development.
Procedures suggested for ensuring coherence don't always
seem to be completely coherent themselves e.g. there is
often a heavy dependence on least squares.

Session 9, Education in Statistics, Conrad Fung (3/27/81)

A  number  of  points  relating  to  education  in  Statistics 
were  discussed  in  the  ninth  session.  It  was  for  example 
felt  that  statistical  teaching  should  relate  both  to  current 
applications  of  our  methods  and  to  the  future  careers  of  our 
students  e.g.  in  industry. 

This  seems  to  be  of  considerable  importance  because  the 
statistics  we  are  teaching  now  is  the  statistics  which  is 
going  to  be  applied  in  industry,  maybe  for  the  next  forty 
years. Perhaps we need a moratorium on all "bad"
statistical  methods  (confidence  intervals,  and  UMP  tests?), 
so  that  only  "good"  methods  (pragmatic  Bayes?)  survive  into 
the  next  century. 

Session  10,  Sequential  Analysis,  Connie  Shapiro  (4/3/81) 

In  the  tenth  session  the  theory  and  practical  relevance 
of  sequential  methods  were  discussed.  The  applicability  of 
the  Likelihood  Principle  was  debated  in  the  context  of  the 
variety  of  stopping  rules  available.  Another  important 
point  is  that,  whilst  an  optimal  Bayes  solution  is  always 
available,  the  extensive  analysis  may  be  extremely 
computationally  complicated  so  that  only  approximate  rules 
are  feasible.  Also,  in  practical  situations  it  is  generally 
infeasible  to  make  the  assumptions  necessary  for  sequential 
analysis,  and  a  pragmatic  rule  will  often  work  better. 
Furthermore,  the  advantages  in  using  a  sequential  rule  may 
be  diminished  when  model  inadequacy  is  taken  into  account. 


Session  11,  The  Truth  About  Bayesian  Inference, 

Steve Stigler (4/10/81)

In the eleventh session, the feasibility of judging
prior opinions via the predictive distribution was
discussed, with historical references to Rev. Thomas Bayes'
original paper. It was suggested that a serious difficulty
is caused for Bayes because, for a given sampling
distribution, there may be no prior distribution
corresponding to the predictive distribution selected.
However this simply means that the predictive and sampling
distributions have not been chosen sensibly, and therefore
provides a coherency check.
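
A small case shows how such a coherency check might operate. The
sketch below is an added illustration, not an example from the
session: it assumes a binomial sampling distribution with two trials,
for which a stated predictive distribution (p0, p1, p2) fixes the
first two prior moments of the success probability, and a prior on
[0, 1] exists only if the implied variance is non-negative. The
function name and the three trial distributions are hypothetical.

```python
def prior_exists(p0, p1, p2, tol=1e-12):
    """Check whether (p0, p1, p2) is a mixture of Binomial(2, theta) laws.

    Under a prior on theta, P(X = 2) = E[theta^2] and
    P(X = 1) = 2 E[theta (1 - theta)], so the predictive fixes
    E[theta] and E[theta^2]. A prior on [0, 1] with these moments
    exists when E[theta]^2 <= E[theta^2]; the upper bound
    E[theta^2] <= E[theta] holds automatically because p1 >= 0.
    """
    if min(p0, p1, p2) < 0 or abs(p0 + p1 + p2 - 1.0) > 1e-9:
        raise ValueError("not a probability distribution")
    m1 = p2 + p1 / 2.0    # implied prior mean E[theta]
    m2 = p2               # implied prior second moment E[theta^2]
    return m1 * m1 <= m2 + tol


uniform_predictive = prior_exists(1 / 3, 1 / 3, 1 / 3)
peaked_predictive = prior_exists(0.1, 0.8, 0.1)
point_mass_predictive = prior_exists(0.25, 0.5, 0.25)

print("uniform predictive coherent:   ", uniform_predictive)
print("peaked predictive coherent:    ", peaked_predictive)
print("binomial(2, 1/2) coherent:     ", point_mass_predictive)
```

The uniform predictive passes, as does the Binomial(2, 1/2)
predictive, which corresponds to a prior concentrated at one half. A
predictive placing probability 0.8 on exactly one success fails, so
that choice of predictive and sampling distributions is not
mutually consistent.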

Session  12,  Rounding  Errors  in  Regression, 

Don  Rubin  (4/13/81) 

In  the  twelfth  session  we  discussed  an  asymptotic  Bayes 
method  for  rounding  errors  which  makes  opposite  adjustments 
to  those  suggested  by  numerical  analysis.  This  is  because 
the  posterior  distribution  of  the  rounding  errors  is  not 
locally  uniform  since  it  incorporates  knowledge  of  the 
regression  line.  This  is  an  excellent  example  of  a 
situation where Bayes and pragmatism can be mixed to good
effect. 

Session  13,  Exchangeability  in  Statistics, 

Dennis  Lindley  (4/24/81) 

In  the  thirteenth  session  we  discussed  the  idea  of 
conditional  exchangeability  of  observations  as  a  Bayesian 
method  for  interpreting  data.  This  for  example  leads  to  a 
resolution  of  Simpson’s  paradox.  It  also  highlights  the 
Bayesian  theme  that  it  is  necessary  to  utilize  information 
concerning  the  background  of  the  data  (e.g.  when  deciding 
which  factor  to  condition  on)  if  we  are  to  have  any  hope  of 
drawing  meaningful  conclusions  from  a  finite  number  of 
observations.
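
Simpson's paradox is easily reproduced with a two-by-two-by-two
table. The counts in the sketch below are hypothetical and serve only
to illustrate the point about conditioning; the treatment and
subgroup labels are likewise assumptions.

```python
# Hypothetical (successes, trials) counts for two treatments, A and B,
# within two subgroups of patients.
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}


def success_rate(successes, trials):
    """Proportion of successes."""
    return successes / trials


def pooled_rate(cells):
    """Success rate after pooling the subgroups of one treatment."""
    successes = sum(cell[0] for cell in cells.values())
    trials = sum(cell[1] for cell in cells.values())
    return successes / trials


within = {group: {t: success_rate(*counts[t][group]) for t in counts}
          for group in ("mild", "severe")}
overall = {t: pooled_rate(counts[t]) for t in counts}

for group, rates in within.items():
    print(f"{group:7s} A: {rates['A']:.3f}   B: {rates['B']:.3f}")
print(f"pooled  A: {overall['A']:.3f}   B: {overall['B']:.3f}")
```

Treatment A has the higher success rate within each subgroup, yet
treatment B has the higher pooled rate. Which comparison is relevant
for a new patient depends on whether that patient is judged
exchangeable with the patients in one subgroup or with the pooled
population, and that judgement rests on background information
rather than on the counts alone.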


SECTION 3: REVIEW OF CLOSING SESSION, TOGETHER WITH
FURTHER IDEAS ON STATISTICAL ETHICS


Session  14,  A  Review  Session  of  the  Bull  Sessions, 

Tom Leonard (5/3/81)

My overall feeling is that an ideal statistician (a)
relies on his common sense and pragmatic judgement, (b) gets
involved in the scientific background of the data, (c) is
prepared to use theory when it is likely to help him reach a
useful conclusion, (d) is unwilling to accept any
theoretical procedure unless he is convinced that it is
practically relevant, (e) is at least partly Bayesian.

I  would  like  to  predict  that  in  the  next  century 
statisticians  will  be  one-third  Bayesian,  one-third  data 
analyst,  and  one-third  scientist,  i.e.  they  will  view 
statistical  theory  and  practice  and  scientific  background  as 
a  single  entity. 

I  would  like  to  conclude  with  some  comments  on  the  role 
of  ethics  in  Statistics. 

Statistical  Ethics 

A  statistical  procedure  could  be  said  to  be  ethical  if 
it has a beneficial effect on the people (e.g. bourgeoisie)
on  whom  it  is  likely  to  have  an  effect. 

Ideas  of  ethicity  seem  to  be  of  growing  importance  in 
Statistics,  for  example  in  medicine  and  education.  I  think 
that  the  profession  should  view  itself  as  responsible  for 
developing  ethical  standards  to  cover  the  effects  of 
statistics  on  ordinary  people. 

A procedure could be said to be irrelevant if it is not
constructed  with  ethicity  in  mind. 

A  procedure  could  be  said  to  be  unethical  if  it  is 
constructed  contrary  to  the  definition  of  ethicity. 

Proposition: In the final analysis, the worthiness of any
statistical  procedure  may  be  based  solely  upon  consideration 
as  to  whether  or  not  it  is  ethical. 

Definition: A procedure possesses double standards if it
purports to be ethical, but is in fact and deed either
irrelevant or unethical.

The primary examples of procedures which possess double
standards are the types of significance tests currently
employed in, say, sociology or psychology, where whole
professions can be misguided by the whims of the "objective"
accept/reject philosophy at the 5% level. The following
proposition might also be worth considering:

Proposition: Some of the very extreme forms of coherent
Bayesian philosophy run the risk of possessing double
standards unless they seek quick resuscitation from
practical data, the scientific environment, and real
statistics. At first sight, they provide us with
all-consuming theories. However, upon careful scrutiny, they
are  irrelevant  and  misleading  in  actual  terms,  and  therefore 
have  an  unhelpful  effect  upon  scientific  investigation.  A 
prime  example  is  Bayesian  Decision  Theory  which  suffers  from 
both  the  ambiguities  of  the  Expected  Utility  Hypothesis  and 
severe  difficulties  in  basing  the  choice  of  loss  function 
upon  practical  reasoning. 

Finally, I would like to suggest a Bible for confirmed
adherents of this sort of philosophy. This is "The Search"
by C. P. Snow, and concerns the realities and unrealities of
scientific investigation together with the unfortunate
experiences in academia of a graduate student with bourgeois
scientific attitudes and moral standards. It is my personal
belief that people who identify with this student may well
have just the right attitude towards scientific
investigation. Perhaps we are all C. P. Snows at heart.

Postscript: The statistical discussion series recommenced
in the Fall of 1981, and new ideas were presented on the
topics discussed above. The next six talks were:

"Some Approaches to Modelling" by Tom Leonard
"Time Series and Outliers" by George Tiao
"The Boundaries of Statistics" by Bob Miller
"Box's Modelling Approach for Bayes-Stein Problems" by
Kevin Little
"The Analysis of Finite Populations" by Jeff Wu
and
"The Analysis of Transformations Revisited; A Rebuttal"
by George Box

with further talks planned by Don Rubin,
Ching-Shui Cheng, Dennis Cox, and Rick Nordheim.

Tapes  are  available  for  all  these  talks,  which  include 
many stimulating discussions together with a number of
humorous  interludes. 